Adding timeout for relay connection #681
Conversation
Walkthrough: Added a configurable relay connection timeout setting (`relay_connection_timeout_secs`); after initiating the relay connection, startup now waits up to the timeout and fails if no relay connects.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant App as Application
    participant Config as Configuration
    participant Util as Connection Utility
    participant Client as Nostr Client
    App->>Config: Load settings (relay_connection_timeout_secs)
    Config-->>App: Return Settings (timeout = 5s)
    App->>Util: connect_nostr(nostr_settings)
    Util->>Util: derive timeout from settings
    Util->>Client: client.connect()
    Client-->>Util: connection initiated
    Util->>Client: client.wait_for_connection(timeout)
    Client-->>Util: connected relays count
    alt >=1 relay connected
        Util-->>App: Ok(client) (ready)
    else 0 relays connected
        Util-->>App: Err(No relays connected)
    end
```
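The "derive timeout from settings" step of the flow above can be sketched in plain Rust. The `NostrSettings` struct here is a hypothetical stand-in for the real configuration type in `src/config/types.rs`, reduced to the one new field:

```rust
use std::time::Duration;

/// Hypothetical mirror of the new configuration field; the real struct
/// in src/config/types.rs has more fields.
struct NostrSettings {
    relay_connection_timeout_secs: u64,
}

/// Derive the wait-for-connection timeout from settings, as in the
/// "derive timeout from settings" step of the diagram.
fn connection_timeout(settings: &NostrSettings) -> Duration {
    Duration::from_secs(settings.relay_connection_timeout_secs)
}

fn main() {
    let settings = NostrSettings { relay_connection_timeout_secs: 5 };
    println!("relay connection timeout: {:?}", connection_timeout(&settings));
}
```

The resulting `Duration` is what would be passed to `client.wait_for_connection(timeout)` in the diagram.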
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/util.rs`:
- Around line 782-785: After calling client.connect().await and
client.wait_for_connection(timeout).await in connect_nostr, verify that at least
one relay actually connected and return an Err when none did; specifically,
after wait_for_connection(timeout) check the client's relay connection
count/status (e.g., using the client instance methods that expose connected
relays or relay statuses) and if zero connected relays, log/return an error
instead of returning Ok(client), otherwise continue returning Ok(client).
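The check the comment asks for can be sketched as a small helper. This is a hypothetical illustration, not the actual `src/util.rs` code: `ensure_relays_connected` and its `String` error type are made up for the example, and `connected` stands in for whatever connected-relay count the client instance exposes after `wait_for_connection(timeout)` resolves:

```rust
/// Hypothetical helper mirroring the suggested check: `connected` is the
/// number of relays that actually reached a connected state after
/// wait_for_connection(timeout) returned.
fn ensure_relays_connected(connected: usize) -> Result<(), String> {
    if connected == 0 {
        // No relay came up within the timeout: return Err instead of Ok(client).
        Err("no relays connected within timeout".to_string())
    } else {
        Ok(())
    }
}

fn main() {
    assert!(ensure_relays_connected(3).is_ok());
    assert!(ensure_relays_connected(0).is_err());
    println!("relay-count checks passed");
}
```

In `connect_nostr` the `Err` branch would be returned to the caller so startup fails fast rather than proceeding with a client that has zero usable relays.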
📒 Files selected for processing (4)
- settings.tpl.toml
- src/app/context.rs
- src/config/types.rs
- src/util.rs
00b7977 to cdebde0
🧹 Nitpick comments (1)
src/util.rs (1)
794-799: Consider startup resilience as a future enhancement. Currently, if all configured relays are temporarily unreachable at startup (e.g., a transient network issue), Mostro will fail to start. This is consistent with the database connection behavior (both use fail-fast patterns).
For future consideration: adding retry with exponential backoff before failing could improve resilience against brief network disruptions during deployment/restart. However, the current behavior is reasonable for this PR's scope of adding timeout handling.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/util.rs`:
- Around lines 794-799: The code returns an error immediately when connected == 0
after waiting nostr_settings.relay_connection_timeout_secs, which makes startup
fail-fast; instead implement a retry loop with exponential backoff before
returning Err: wrap the existing check in a limited number of retries
(configurable via a new setting like nostr_settings.relay_connection_retry_attempts)
and on each failure wait an exponentially increasing delay (base + jitter) before
re-checking connected; keep the existing error path as a final fallback and log
each retry attempt, including the current attempt number, delay, and
nostr_settings.relay_connection_timeout_secs, so callers can trace retries.
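As a rough illustration of the backoff schedule the prompt describes, the per-attempt delay can be computed as below. All names and constants here are illustrative (there is no such helper in Mostro), and jitter is omitted for determinism; a real implementation would add a small random component:

```rust
use std::time::Duration;

/// Compute the delay before retry `attempt` (1-based): base * 2^(attempt-1),
/// capped at `max`. Jitter is deliberately left out of this sketch.
fn backoff_delay(base: Duration, attempt: u32, max: Duration) -> Duration {
    base.checked_mul(2u32.saturating_pow(attempt.saturating_sub(1)))
        .unwrap_or(max) // treat overflow as "use the cap"
        .min(max)
}

fn main() {
    // With base = 1s and cap = 30s: 1s, 2s, 4s, 8s, 16s, 30s, 30s, ...
    for attempt in 1..=7 {
        let delay = backoff_delay(Duration::from_secs(1), attempt, Duration::from_secs(30));
        println!("attempt {attempt}: wait {delay:?} before re-checking connected relays");
    }
}
```

The retry loop would call this between re-checks of the connected-relay count, returning the existing `Err` only after the configured number of attempts is exhausted.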
📒 Files selected for processing (4)
- settings.tpl.toml
- src/app/context.rs
- src/config/types.rs
- src/util.rs
✅ Files skipped from review due to trivial changes (1)
- settings.tpl.toml
🚧 Files skipped from review as they are similar to previous changes (1)
- src/app/context.rs
Catrya
left a comment
Thanks for the contribution @sergey3bv
However, this PR doesn't address the issue it references (mobile #548). That issue is about the Mostro Mobile app hanging when a relay is down; it's an app-side problem caused by dart_nostr connecting to relays sequentially with no effective connection timeout.
The mostro daemon uses nostr-sdk, which handles this correctly: client.connect().await is non-blocking, it spawns independent background tasks per relay in parallel with per relay timeouts, so a dead relay never blocks the others.
If all relays are down at startup, operations like send_event simply fail individually while nostr-sdk keeps retrying with adaptive backoff, and reconnects automatically when relays come back up. Currently, when a relay is down, mostrod continues working normally, you can see the failing relay in the logs but it doesn't affect operations.
So the scenario this PR guards against is already handled gracefully. I'd suggest we close this one to keep things simple and avoid adding a config field and logic for something already covered, but I'll leave that to your consideration, @grunch @arkanoider.
Indeed, it was not there when I started working, so the context was different
arkanoider
left a comment
@sergey3bv thanks for your effort, but I confirm, as @Catrya pointed out, that rust-nostr manages relay connections smoothly: every connection is spawned in a separate thread/task.
closing this PR
Should solve MostroP2P/mobile#548